4.4 Depth-First Consequences

xkcd DFS comic

Due to the way Prolog uses Depth-First Search, the way we construct our database can have a huge effect on the efficiency and even termination of Prolog queries. ## Efficiency

We previously considered how Prolog would search for Bart’s aunts using the rule

        aunt(A,N) :- parent(P,N), sib(P,A), female(A).

The left-to-right DFS strategy means that Prolog picks a parent P of bart, and then considers all possible siblings of that parent (checking whether each is female) before it is willing to backtrack and try a different parent P.

But suppose we had written the definition as follows:

        aunt(A,N) :- female(A), sib(P,A), parent(P,N).

In terms of formal logic, we haven’t changed anything. The order of formulas in a conjunction doesn’t change the truth value. But in terms of Prolog execution, we have made queries substantially slower.

Suppose we use this new rule to find Bart’s aunts. Now left-to-right DFS means we start by picking the first female A in the database. We consider each person P who has this female A as sibling, and check if any of those P’s happen to be a parent of bart. Once we’ve checked all of A’s siblings, we set A to be the next female in the database, and so on.

This will still find all of Bart’s aunts, but it will be substantially slower! Rather than going from Bart to Bart’s parents to Bart’s parent’s female siblings, we have to look at every female in the database and all of their siblings, just to see if any of those siblings happen to be one of Bart’s parents.

So the moral is, we have to think about how Prolog’s DFS will work when writing our rules if we want our Prolog code to be correct and efficient. ## Termination

In our earlier example of facts about which foods pair well together, there is an annoyance: although Prolog agrees that pairs(apple, walnut) is true, it does not realize that pairs(walnut, apple) is also true, because Prolog only knows what we tell it.

We could just add pairs(walnut, apple) to the database, but it’s tempting to make a general rule:

pairs(apple, walnut).
pairs(apple, honey).
pairs(walnut, avocado).
% ...etc...

pairs(X, X).         % every food X pairs with itself
pairs(_, coconut).   % any food pairs well with coconut
pairs(A, B) :- pairs(B, A).  % if A pairs with B, B pairs with A

This is perfectly logical (and Prolog does now agree that pairs(walnut,apple) and pairs(honey, apple) are true!) but it has a drawback. If we ask Prolog to find a yummy_triple (three foods that pair well together), the left-to-right DFS strategy works as follows:

Prolog first finds that apple and walnut pair, and tries to find a third food to complete the triple.
1. Prolog next finds that walnut and avocado pair.
  - To confirm we have found a valid triple, Prolog checks that pairs(apple, avocado) pair. This isn’t a fact in the database, but our new rule says that this is true if pairs(avocado, apple) is true.
    - Prolog checks pairs(avocado, apple) This isn’t a fact in the database, but our new rule says that this succeeds as long as pairs(apple, avocado) is true.
      - Prolog checks pairs(apple, avocado) … The search has entered an infinite loop; the search will continue forever without finding any valid triples. Prolog thinks it is always making progress, so the DFS never backtracks to reconsider any of the food choices it has already made!

If we had put our new rule first in the database, the situation would be even worse!

pairs(A, B) :- pairs(B, A).  % if A pairs with B, B pairs with A
pairs(apple, walnut).
pairs(apple, honey).
% ...etc...

because now if we query Prolog for a simple fact like

        ?- pairs(apple, walnut).

search immediately enters an infinite loop! The DFS goes as follows:

The first rule in the database that unifies with our goal pairs(apple, walnut) is the symmetry rule, so Prolog tries this rule first: pairs(apple, walnut) is true if `pairs(walnut, apple) is true.
1. The first rule in the database that unifies with pairs(walnut, apple) is the symmetry rule, so Prolog recurses to check `pairs(apple, walnut).
  - Search is now back where it started; Prolog is in an infinite loop and sadly will never discover a proof that pairs(apple, walnut) is true (even though it’s asserted as a simple fact on the next line of our database.) ## Conclusion

Even though Prolog is based on declarative mathematical logic at its core, to use Prolog effectively, we have to structure our our rules carefully and think about how DFS will play out. Just as in more familiar languages like Python or Java, the order of steps in Prolog code has a huge impact on efficiency and correctness.

Extra: Improved Symmetry

If adding a symmetry rule to the database causes infinite loops, what is a better approach?

Here’s the simplest solution: avoid the recursive definition by using different predicate names for the original data and the symmetric predicate, e.g.,

%
% Specific facts about food pairs (renamed predicate)
%
p(apple, walnut).       p(banana, coriander).
p(apple, honey).        p(strawberry, honey).
p(walnut, avocado).     p(strawberry, ginger).
p(walnut, banana).      p(strawberry, tea).
p(apple, banana).       p(tea, walnut).
p(banana, ginger).      p(tea, tomato).
p(banana, cloves).      p(tea, milk).
p(banana, strawberry).

%
% General rules about food pairing (renamed predicate)
%
p(X, X).         % every food X pairs with itself
p(_, coconut).   % any food pairs well with coconut

%
% The pairs predicate is now symmetric but not recursive
%
pairs(A,B) :- p(A,B).
pairs(A,B) :- p(B,A).

Previous: 4.3 Depth-First Search

Next: 4.5 Generate and Test